There is a hidden intrigue in the title. CT is one of the most abstract mathematical disciplines, sometimes
nicknamed "abstract nonsense". MDE is a recent trend in software development, industrially
supported by standards, tools, and the status of a new "silver bullet". Surprisingly, categorical pat- 
terns turn out to be directly applicable to mathematical modeling of structures appearing in everyday 
MDE practice. Model merging, transformation, synchronization, and other important model man- 
agement scenarios can be seen as executions of categorical specifications. 

Moreover, the paper aims to elucidate a claim that relationships between CT and MDE are more 
complex and richer than is normally assumed for "applied mathematics". CT provides a toolbox of 
design patterns and structural principles of real practical value for MDE. We will present examples 
of how an elementary categorical arrangement of a model management scenario reveals deficiencies 
in the architecture of modern tools automating the scenario.

1 Introduction



There are several well established applications of category theory (CT) in theoretical computer science; 
typical examples are programming language semantics and concurrency. Modern software engineering 
(SE) seems to be an essentially different domain, not obviously suitable for theoretical foundations based 
on abstract algebra. Too much in this domain appears to be ad hoc and empirical, and the rapid progress 
of open source and collaborative software development, service-oriented programming, and cloud com- 
puting far outpaces their theoretical support. Model driven (software) engineering (MDE) conforms to 
this description as well: the diversity of modeling languages and techniques successfully resists all at- 
tempts to classify them in a precise mathematical way, and model transformations and operations — 
MDE's heart and soul — are an area of a diverse experimental activity based on surprisingly weak (if 
any) semantic foundations. 

In this paper we claim that the theoretical underpinning of modern SE can be based on CT, and quite
naturally so. The chasm between SE and CT can be bridged, and MDE appears as a "golden cut",
in which an abstract view of SE realities and concrete interpretations of categorical abstractions merge
together: SE → MDE ← CT. The left leg of the cospan is extensively discussed in the MDE literature
(see H71 and references therein); prerequisites and challenges for building the right leg are discussed in 
the present paper. Moreover, we aim to elucidate a claim that relationships between CT and MDE are 
more complex and richer than is normally assumed for "applied mathematics". CT provides a toolbox of 
design patterns and principles, whose added value goes beyond such typical applications of mathematics 
to SE as formal semantics for a language, or formal analysis and model checking.

Two aspects of the CT-MDE "marriage" are discussed in the paper. The first one is a standard argument
about the applicability of a particular mathematical theory to a particular engineering discipline.
To wit, there is a mathematical framework called CT, there is an engineering domain called MDE, and 
we will try to justify the claim that they make a good match, in the sense that concepts developed in the 
former are applicable for mathematical modeling of constructs developed in the latter. What makes this 
standard argument exciting is that the mathematical framework in question is known to be notoriously 
abstract, while the engineering domain is very agile and seemingly not suitable for abstract treatment. 
Nevertheless, the argument lies within the boundaries of yet another instance of the evergreen story of 
applying mathematics to engineering problems. Below we will refer to this perspective on the issue as 
Aspect A. 

The second perspective (Aspect B) is less standard and even more interesting. It is essentially based 
on specific properties of categorical mathematics and on the observation that software engineering is 
a special kind of engineering. To wit, CT is much more than a collection of mathematical notions and 
techniques: CT has changed the very way we build mathematical models and reason about them; it can be 
seen as a toolbox of structural design patterns and the guiding principles of their application. This view 
on CT is sometimes called arrow thinking. On the other hand, SE, in general, and MDE, in particular, 
essentially depend on proper structuring of the universe of discourse into subuniverses, which in their 
turn are further structured and so on, which finally results in tool architectures and code modularization. 
Our experience and attempts to understand complex structures used in MDE have convinced us that 
general ideas of arrow thinking, and general patterns and intuitions of what a healthy structure should be, 
turn out to be useful and beneficial for such practical concerns as tool architecture and software design. 

The paper is structured as follows. In Section 2 we present two very general A-type arguments
that CT provides a "right" mathematical framework for SE. The second argument also gives strong
prerequisites for the B-side of our story. Section 3.1 gives a brief outline of MDE, and Section 3.2
reveals a truly categorical nature of the cornerstone notions of multimodeling and intermodeling (another
A-argument). In Section 4 we present two examples of categorical arrangement of model management
scenarios: model merge and bidirectional update propagation. This choice is motivated by our research
interests and the possibility to demonstrate the B-side of our story. In Section 5 we discuss and exemplify
three ways of applying CT for MDE: understanding, design patterns for specific problems, and general
design guidance on the level of tool architecture.



2 Two very general perspectives on SE and Mathematics 
2.1 The plane of Software × Mathematics

The upper half of Fig. 1 presents the evolution of software engineering in a schematic way, following
Mary Shaw [48] and José Fiadeiro [25]. Programming-in-the-head refers to the period when a software
product could be completely designed, at least in principle, "inside the head" of one (super intelligent) 
programmer, who worked like a researcher rather than as an engineer. The increasing complexities of 
problems addressed by software solutions (larger programs, more complex algorithms and data struc- 
tures) engendered more industrially oriented/engineering views and methods (e.g., structured program- 
ming). Nevertheless, for Programming-in-the-small, the software module remained the primary goal 
and challenge of software development, with module interactions being simple and straightforward (e.g., 
procedure calls). In contrast, Programming-in-the-large marks a shift to the stage when module compo- 
sition becomes the main issue, with the numbers of modules and the complexity of module interaction 
enormously increased. This tendency continued to grow and widened in scope as time went on, and
today manifests itself as Programming-in-the-world. The latter is characterized by a large, and growing,
heterogeneity of modules to be composed and of methods for their composition, and by such essentially
in-the-large modern technologies as service orientation, open source and collaborative
software development, and cloud computing.

Figure 1: Evolution of software (M refers to Many/Multitude)

The lower part of Fig. 1 presents this picture in a very schematic way as a path from 1 to M to $M^n$,
with M referring to multiplicity in different forms, and degree n indicating the modern tendencies of
growth in heterogeneity and complexity.

MDE could be seen as a reaction to this development, a way of taming the growth of n in a systematic
way. Indeed, until recently, software engineers could feel that they could live without mathematical
models: just build the software by whatever means available, check and debug it, and keep doing this
throughout the software's life. (Note that the situation in classical (mechanical and electrical) engineering
is essentially different: debugging, say, a bridge, would be a costly procedure, and classical engineers
abandoned this approach a long time ago.) But this gift of easily built systems afforded to software
engineers is rapidly degrading, as the costs of this process and the liability from getting it wrong are both
growing at an enormous rate. By slightly rephrasing Dijkstra, we may say that precise modeling and
specification become a matter of life and death rather than a luxury.

These considerations give us the vertical axis in Fig. 2, skipping the intermediate point. The horizontal
axis represents the evolution of mathematics in a similarly simplified way. Point 1 corresponds to
the modern mathematics of mathematical structures in the sense of Bourbaki: what matters is operations
and relations over mathematical objects rather than their internal structure. Skipped point M corresponds
to basic category theory: the internal structure of the entire mathematical structure is encapsulated, and
mathematical studies focus on operations and relations over structures considered as holistic entities.
The multitude of higher degree, $M^\infty$, refers to categorical facilities for reflection: enrichment,
internalization, higher dimensions, which can be applied ad infinitum, hence the $\infty$-degree.

This (over-simplified) schema gives us four points of Math×SE interaction. Interaction (1,1) turned
out to be quite successful, as evidenced by such theory-based practical achievements as compilers, model
checking, and relational DB theory. As for the point $(1, M^n)$, examining the literature shows that attempts
at building theoretical foundations for MDE based on classical 1-mathematics were not successful. A
major reason seems to be clear: 1-mathematics does not provide an adequate machinery for specifying
and reasoning about inter-structural relationships and operations, which are at the very heart of modern
software development. This point may also explain the general skepticism that a modern software engineer,
and an academic teaching software engineering, feel about the practicality of using mathematics
for modern software design: unfortunately, the only mathematics they know is the classical mathematics
of Bourbaki and Tarski.

On the other hand, we view several recent applications of categorical methods to MDE problems 
[5l H7l [37l EH |45l |44l [22l [191 |43l |46j as promising theoretical attempts, with great potential for 

practical application. This provides a firm plus for the $(M^\infty, M^n)$ point in the plane.

Moreover, as emphasized by Lawvere, the strength of CT based modeling goes beyond modeling
multi-structural aspects of the mathematical universe, and a categorical view of a single mathematical
structure can be quite beneficial too. This makes point $(M^\infty, 1)$ in the plane potentially interesting, and
indeed, several successful applications at this point are listed in the figure.



2.2 Mathematical modeling of engineering artifacts: Round-tripping abstraction vs. waterfall based abstraction

Fig. 3(a) shows a typical way of building mathematical models for mechanical and electrical engineering
domains. Meta-mathematics (the discipline of modeling mathematical models) is not practically
needed for engineering as such. The situation dramatically changes for software engineering. Indeed,
category theory (CT) could be defined as a discipline for studying mathematical structures: how to
specify, relate and manipulate them, and how to reason about them. In this definition, one can safely
remove the adjective "mathematical" and consider CT as a mathematical theory of structures in a very
broad sense. Then CT becomes directly applicable to SE as shown in Fig. 3(b). Moreover, CT has actually
changed the way of building mathematical structures and thinking about them, and found extensive and
deep applications in theoretical computer science. Hence, CT can be considered as a common theoretical
framework for all modeling stages in the chain (and be placed at the center). In this way, CT provides a
remarkable unification for modeling activities in SE.

The circular, nonlinear nature of the figure also illustrates an important point about the role of CT
in SE. Because software artifacts are conceptual rather than physical entities, there is potential for feedback
between SE and Mathematics in a way that is not possible in traditional scientific and engineering
disciplines. Design patterns employed in SE can be, and have been, influenced by mathematical models
of software and the way we develop them.



Figure 3: Modeling chains (panel (b): circular modeling chain)

We will begin with a rough general schema of the MDE approach to building software (Section 3.1), and
then will arrange this schema in categorical terms (Section 3.2).



3.1 MDE in a nutshell 

The upper-left corner of Fig. 4 shows a general goal of software design: building software that correctly
interacts with different subsystems of the world (shown by figures of different shapes). For example,
software embedded in a car interacts with its mechanical, electrical and electronic subsystems, with the
driver and passengers, and, in future car designs, with other cars on the road. These components
interact between themselves, which is schematically shown by overlaps of the respective shapes. The
lower-right corner of Fig. 4 shows software modularized in parallel to the physical world it should
interact with. The passage from the left to the right is highly non-trivial, and this is what makes SE larger
and more challenging than mere programming. An effective means to facilitate the transition is to use
models: a system of syntactical objects (as a rule, diagrammatic) that serve as abstractions of the
"world entities" as shown in the figure (note the links from pieces of World to the respective parts of
Modelware). These abstractions are gradually developed and refined until finally transformed into code.
The modelware universe actually consists of a series of "modelwares", that is, systems of requirement,
analysis, and design models, with each consecutive member in the list refining the previous one, and in its
own turn encompassing several internal refinement chains. Modelware development consumes intelligence
and time, but is still easier and more natural for a human than writing code; the latter is generated
automatically. The main idea of MDE is that human intelligence should be used for building models rather
than code.

Of course, models had been used for building software long before the MDE vision appeared in
the market. At that time, however, after the first version of a software product had been released, its
maintenance and further evolution were conducted mainly through code, so that models quickly
became outdated, degraded, and finally useless. In contrast, MDE assumes that maintenance and
evolution should also go through models. No doubt some changes in the real world are much easier
to incorporate immediately in the code rather than via models, but then MDE prescribes updating the
models to keep them in sync with code. In fact, code becomes just a specific model, whose only essential
distinction from other models in the modelware universe is its final position in the refinement chain.
Thus, the Modelware boundary in Fig. 4 should be extended to encompass the Software region too.




Figure 4: MDE, schematically

3.2 Modelware categorically

Consider a modelware snapshot in Fig. 4. Notice that models as such are separated whereas their referents
overlap, that is, interact between themselves. This interaction is a fundamental feature of the
real world, and to make the model universe adequate to the world, intermodel correspondences/relations
must be precisely specified. (For example, the figure shows three binary relations, and one ternary relation
visualized as a ternary span with a diamond head.) With reasonable modeling techniques, intermodel
relations should be compatible with model structures. The modelware universe then appears as a collection
of structured objects and structure-compatible mappings between them, that is, as a categorical
phenomenon. In more detail, a rough categorical arrangement could be as follows.

The base universe. Models are multi-sorted structures whose theories are called metamodels. The
latter can be seen as generalized sketches [39, 20], that is, pairs $M = (G_M, C_M)$ with $G_M$ a graph (or,
more generally, an object of an a priori fixed presheaf topos $\mathbf{G}$), and $C_M$ a set of constraints (i.e., diagram
predicates) declared over $G_M$. An instance of metamodel $M$ is a pair $A = (G_A, t_A)$ with $G_A$ another graph
(an object in $\mathbf{G}$) and $t_A\colon G_A \to G_M$ a mapping (arrow in $\mathbf{G}$) to be thought of as typing, which satisfies the
constraints, $A \models C_M$ (see [20] for details). An instance mapping $A \to B$ is a graph mapping $f\colon G_A \to G_B$
commuting with typing: $f; t_B = t_A$. This defines a category $Mod(M) \subset \mathbf{G}/G_M$ of $M$-instances.
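To make the definition of instances and instance mappings concrete, the following minimal sketch encodes graphs as plain Python dictionaries and checks the typing condition $f; t_B = t_A$. The encoding, the function names (`is_graph_map`, `compose`, `is_instance_map`) and the toy metamodel are assumptions made for illustration only; constraints $C_M$ are not modeled.

```python
# A graph is a dict {"nodes": set, "edges": {edge: (source, target)}}.
# A graph mapping is a pair (node_map, edge_map) of dicts.

def is_graph_map(f, G, H):
    """Check that f sends every edge of G to an edge of H with matching ends."""
    node_map, edge_map = f
    if any(node_map.get(n) not in H["nodes"] for n in G["nodes"]):
        return False
    for e, (s, t) in G["edges"].items():
        if edge_map.get(e) not in H["edges"]:
            return False
        if H["edges"][edge_map[e]] != (node_map[s], node_map[t]):
            return False
    return True

def compose(f, g):
    """Diagrammatic composition f;g (first f, then g)."""
    return ({n: g[0][x] for n, x in f[0].items()},
            {e: g[1][x] for e, x in f[1].items()})

def is_instance_map(f, A, B):
    """A = (G_A, t_A), B = (G_B, t_B); f must be a graph map with f;t_B = t_A."""
    (G_A, t_A), (G_B, t_B) = A, B
    return is_graph_map(f, G_A, G_B) and compose(f, t_B) == t_A

# Toy metamodel graph G_M: classes own attributes (hypothetical example).
G_M = {"nodes": {"Class", "Attr"}, "edges": {"owns": ("Class", "Attr")}}

G_A = {"nodes": {"Person", "name"}, "edges": {"e1": ("Person", "name")}}
t_A = ({"Person": "Class", "name": "Attr"}, {"e1": "owns"})

G_B = {"nodes": {"Person", "Company", "name", "title"},
       "edges": {"e1": ("Person", "name"), "e2": ("Company", "title")}}
t_B = ({"Person": "Class", "Company": "Class", "name": "Attr", "title": "Attr"},
       {"e1": "owns", "e2": "owns"})

inclusion = ({"Person": "Person", "name": "name"}, {"e1": "e1"})
bad = ({"Person": "name", "name": "name"}, {"e1": "e1"})  # breaks typing
```

Here `inclusion` is an instance mapping from A to B, while `bad` is rejected because it sends a node typed by `Class` to a node typed by `Attr`.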

To deal with the heterogeneous situation of models over different metamodels, we first introduce
metamodel morphisms $m\colon M \to N$ as sketch morphisms, i.e., graph mappings $m\colon G_M \to G_N$ compatible
with constraints. This gives us a category of metamodels $MMod$. Now we can merge all categories
$Mod(M)$ into one category $Mod$, whose objects are instances (= $\mathbf{G}$-arrows) $t_A\colon G_A \to G_{M(A)}$,
$t_B\colon G_B \to G_{M(B)}$, etc., each having its metamodel, and morphisms $f\colon A \to B$ are pairs $f_{data}\colon G_A \to G_B$,
$f_{meta}\colon M(A) \to M(B)$ such that $f_{data}; t_B = t_A; f_{meta}$, i.e., commutative squares in $\mathbf{G}$. Thus, $Mod$ is a
subcategory of the arrow category $\mathbf{G}^{\to}$.

It can be shown that pulling back a legal instance $t_B\colon G_B \to G_N$ of metamodel $N$ along a sketch
morphism $m\colon M \to N$ results in a legal instance of $M$ [20]. We thus have a fibration $p\colon Mod \to MMod$,
whose Cartesian lifting is given by pullbacks.
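The pullback construction can be sketched for finite graphs, where it is computed componentwise on nodes and edges. The sketch below is an illustration under stated assumptions: graphs use a hypothetical dictionary encoding, the function name `pullback_instance` is made up for this example, and constraints are ignored, so legality of the result is not checked.

```python
# A graph is {"nodes": set, "edges": {edge: (source, target)}};
# a graph mapping is a pair (node_map, edge_map) of dicts.

def pullback_instance(m, t_B, G_M, G_B):
    """Pull back the typing t_B: G_B -> G_N along m: G_M -> G_N.

    Returns (G, typing) where G consists of pairs (x, b) with m(x) = t_B(b),
    and typing: G -> G_M is the first projection.
    """
    m_nodes, m_edges = m
    t_nodes, t_edges = t_B
    nodes = {(x, b) for x in G_M["nodes"] for b in G_B["nodes"]
             if m_nodes[x] == t_nodes[b]}
    edges = {}
    for ex, (sx, tx) in G_M["edges"].items():
        for eb, (sb, tb) in G_B["edges"].items():
            if m_edges[ex] == t_edges[eb]:
                edges[(ex, eb)] = ((sx, sb), (tx, tb))
    typing = ({n: n[0] for n in nodes}, {e: e[0] for e in edges})
    return {"nodes": nodes, "edges": edges}, typing

# Metamodel N has classes owning attributes; metamodel M only knows entities.
G_M = {"nodes": {"Entity"}, "edges": {}}
m = ({"Entity": "Class"}, {})                      # m: G_M -> G_N

G_B = {"nodes": {"Person", "Company", "name"},
       "edges": {"e1": ("Person", "name")}}
t_B = ({"Person": "Class", "Company": "Class", "name": "Attr"},
       {"e1": "owns"})                             # t_B: G_B -> G_N

view, view_typing = pullback_instance(m, t_B, G_M, G_B)
```

In this toy case the pullback keeps the two class-typed nodes of B, retyped as entities, and forgets the attribute and the ownership edge, which is the expected behaviour of a view that does not mention attributes.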

Intermodel relations and queries. A typical intermodeling situation is when an element of one model
corresponds to an element that is not immediately present in another model, but can be derived from other
elements of that model by a suitable operation (a query, in the database jargon) [19]. Query facilities can
be modeled by a pair of monads $(Q_{def}, Q)$ over categories $MMod$ and $Mod$, resp. The first monad
describes the syntax (query definitions), and the second one provides the semantics (query execution).

A fundamental property of queries is that the original data are not affected: queries compute new
data but do not change the original. Mathematical modeling of this property results in a number of
equations, which can be summarized by saying that monad $Q$ is $p$-Cartesian, i.e., the Cartesian and
the monad structure work in sync. It can be shown [19] that a query language $(Q, Q_{def})$ gives rise to a
fibration $p_Q\colon Mod_Q \to MMod_{Q_{def}}$ between the corresponding Kleisli categories. These Kleisli categories
have immediate practical interpretations. Morphisms in $MMod_{Q_{def}}$ are nothing but view definitions: they
map elements of the source metamodel to queries against the target one. Correspondingly, morphisms in
$Mod_Q$ are view executions composed from query execution followed by retyping. The fact that projection
$p_Q$ is a fibration implies that the view execution mechanism is compositional: execution of a composed
view equals the composition of executions.
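Compositionality of view execution can be illustrated on a deliberately tiny query language. In the sketch below, which is an assumption-laden toy rather than the construction of [19], a model is a dictionary from type names to sets of elements, a query is either a type name or a term `("union", q1, q2)` / `("diff", q1, q2)`, and a view definition maps each type of the view metamodel to a query against the source metamodel. Composition of view definitions is substitution of query terms (the Kleisli-style step), and the final check confirms that executing the composed view agrees with executing the two views one after the other.

```python
def evaluate(q, model):
    """Execute a query term against a model (dict: type name -> set)."""
    if isinstance(q, str):
        return model[q]
    op, left, right = q
    l, r = evaluate(left, model), evaluate(right, model)
    return l | r if op == "union" else l - r

def execute(view, model):
    """Execute a view definition: one query per type of the view metamodel."""
    return {t: evaluate(q, model) for t, q in view.items()}

def substitute(q, view):
    """Replace every type name in q by the query that `view` assigns to it."""
    if isinstance(q, str):
        return view[q]
    op, left, right = q
    return (op, substitute(left, view), substitute(right, view))

def compose(outer, inner):
    """Compose view definitions: outer is written against inner's view types."""
    return {t: substitute(q, inner) for t, q in outer.items()}

# Hypothetical source model and two stacked views.
A = {"Employee": {"ann", "bob"}, "Contractor": {"cy"}, "Manager": {"ann"}}
v = {"Worker": ("union", "Employee", "Contractor"), "Boss": "Manager"}
w = {"Staff": ("diff", "Worker", "Boss")}

composed = compose(w, v)
stepwise = execute(w, execute(v, A))
direct = execute(composed, A)
```

Both `stepwise` and `direct` yield the same `Staff` set, which is the compositionality property stated above, specialized to this toy language.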

Now a correspondence between models $A, B$ over metamodels $M, N$ can be specified by data shown
in Fig. 5; these data consist of three components. (1) Span $(m\colon M \Leftarrow MN,\ n\colon MN \Rightarrow N)$ (whose legs are
Kleisli mappings) specifies a common view $MN$ between the two metamodels. (2) Trapezoids (arrows
in $Mod_Q$) are produced by $p_Q$-Cartesian "lifting", i.e., by executing views $m$ and $n$ for models $A$ and $B$
resp., which results in models $A|_m$ and $B|_n$ (here and below we use the following notation: computed
nodes are not framed, and computed arrows are dashed). (3) Span $(p\colon A|_m \leftarrow AB,\ q\colon AB \to B|_n)$ specifies
a correspondence between the views. Note that this span is an independent modelware component and
cannot be derived from models $A, B$.

Spans like those in Fig. 5 integrate a collection of models into a holistic system, which we will refer
to as a multimodel. Examples, details, and a precise definition of a multimodel's consistency can be
found in [21].

It is tempting to encapsulate spans in Fig. 5 as composable arrows and work with the corresponding
(bi)categories of metamodels and models. Unfortunately, it would not work out because, in general,
Kleisli categories are not closed under pullbacks, and it is not clear how to compose Kleisli spans. It is
an important problem to overcome this obstacle and find a workable approach to Kleisli spans.

Until the problem above is solved, our working universe is the Kleisli category of heterogeneous
models fibred over the Kleisli category of metamodels. This universe is a carrier of different operations
and predicates over models, and a stage on which different modeling scenarios are played. Classification
and specification of these operations and predicates, and their understanding in conventional mathematical
terms, is a major task of building mathematical foundations for MDE. Algebraic patterns appear here
quite naturally, and then model management scenarios can be seen as algebraic terms composed from
diagram-algebra operations over models and model mappings.¹ The next section provides examples of
such algebraic arrangements.

Figure 5: Correspondences between heterogeneous models



4 Model management (MMt) and algebra: Two examples 

We will consider two examples of algebraic modeling of MMt scenarios: a simple one, model merging,
and a more complex and challenging one, bidirectional update propagation (BX).

4.1 Model merge via colimit 

Merging several interrelated models without data redundancy and loss is an important MDE scenario.
Models are merged (virtually rather than physically) to check their consistency, or to extract integrated
information about the system. A general schema is shown in Fig. 6. Consider first the case of several
homogeneous models A, B, C, ... to be merged. The first step is to specify correspondences/relations between
models via Kleisli spans R1, R2, or perhaps direct mappings like r3. The intuition of merging
without data loss and redundancy (duplication of correspondent data) is precisely captured by the universal
property of colimits, that is, it is reasonable to define merge as the colimit of a diagram of models
and model mappings specifying intermodel correspondences.
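For models reduced to plain sets of elements, the colimit of a diagram of correspondence spans is the disjoint union of the models quotiented by the declared correspondences. The following sketch computes it with a union-find structure. It is a simplification under stated assumptions: models carry no graph structure or typing, each span is given directly as a set of matched pairs (its head, with the projections as legs), and the name `merge` and the example data are hypothetical.

```python
def merge(models, spans):
    """Colimit in Set of a diagram of models and correspondence spans.

    models: dict name -> set of elements.
    spans:  list of (left_name, right_name, pairs), where pairs is a set of
            (left_element, right_element) declared to be "the same".
    Returns the set of equivalence classes; each class is a frozenset of
    (model_name, element) pairs.
    """
    # Start from the disjoint union: every element is tagged by its model.
    parent = {(m, x): (m, x) for m, xs in models.items() for x in xs}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path compression
            v = parent[v]
        return v

    # Glue together the elements related by the spans.
    for left, right, pairs in spans:
        for a, b in pairs:
            parent[find((left, a))] = find((right, b))

    classes = {}
    for v in parent:
        classes.setdefault(find(v), set()).add(v)
    return {frozenset(c) for c in classes.values()}

models = {"A": {"Person", "name"}, "B": {"Employee", "name", "salary"}}
spans = [("A", "B", {("Person", "Employee"), ("name", "name")})]
merged = merge(models, spans)
```

The result has three elements: the glued pair Person/Employee, the glued pair of `name` attributes, and `salary`, which appears only in B. Nothing is lost and nothing matched is duplicated, which is the informal content of the universal property.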

¹ Note, however, that a proper categorical treatment of these operations in terms of universal constructions
may not be straightforward.

If models are heterogeneous, their relations are specified as in Fig. 5. To merge, we first merge
metamodels modulo metamodel spans. Then we can consider all models and heads of the correspondence
spans as instances of the merged metamodel, and merge models by taking the colimit of the entire
diagram in the category of instances of the merged metamodel.

Figure 6: Model merge

An important feature of viewing model merge as described above is a clear separation of two stages
of the merge process: (i) discovering and specifying intermodel correspondences (often called
model matching), and (ii) merging models modulo these correspondences. The first stage is inherently
heuristic and context dependent. It can be assisted by tools based on AI technologies, but in general
user input is required for final adjustment of the match (and of course to define the heuristics used by
the tool). The second stage is pure algebra (colimit) and can be performed automatically. The first stage
may heavily depend on the domain and the application, while the second one is domain and application
independent. However, a majority of model merge tools combine the two stages into a holistic merge
algorithm, which first somehow relates models based on a specification of conflicts between them, and
then proceeds accordingly to merging. Such an approach complicates merge algorithms, and makes a
taxonomy of conflicts their crucial component; typical examples are [49, 42].

The cause of this deficiency is that tool builders rely on a very simple notion of model matching,
which amounts to linking the-same-semantics elements in the models to be matched. However, as discussed
above in Section 3.2, for an element e in model A, the-same-semantics B-element e' may be only
indirectly present in B, i.e., e' can be derived from other elements of B with a suitable operation (query)
over B rather than being an immediate element of B. With complex (Kleisli) matching that allows one to
link basic elements in one model with derived elements in another model, the algebraic nature of merge
as such (via the colimit operation) can be restored. Indeed, it is shown in [9] that all conflicts considered
in [42] can be managed via complex matching, that is, described via Kleisli spans with a suitable choice
of queries, after which merge is computed via colimit.



4.2 Bidirectional update propagation (BX) 

Keeping a system of models mutually consistent (model synchronization) is vital for model-driven en- 
gineering. In a typical scenario, given a pair of inter-related models, changes in either of them are to 
be propagated to the other to restore consistency. This setting is often referred to as bidirectional model 
transformation (BX) [0. 



4.2.1 BX via tile algebra 

A simple BX scenario is presented in Fig. 7(a). Two models, A and B, are interrelated by some
correspondence specification r (think of a span in a suitable category, or an object in a suitable comma
category; see [21] for examples). We will often refer to such specifications as horizontal deltas between
models. In addition, there is a notion of delta consistency (extensionally, a class of consistent deltas),
and if r is consistent, we call models A and B synchronized.

Figure 7: BX scenario specified in (a) delta-based and (b) state-based way

Now suppose that (the state of) model B has changed: the updated (state of the) model is B', and
arrow b denotes the correspondence between B and B' (a vertical delta). The reader may think of a span,
whose head consists of unchanged elements and the legs are injections so that B's elements beyond the
range of the upper leg are deleted, and the elements of B' beyond the range of the lower leg are inserted.
Although update spans are denoted by bidirectional arrows, the upper node is always the source, and the
lower is the target.
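The reading of an update span in terms of deletions and insertions can be sketched for models reduced to sets of elements. In this illustration, which assumes a hypothetical encoding and the made-up name `update_span`, the head of the span is given as a dictionary `same` from retained elements of B to their counterparts in B'; injectivity of both legs corresponds to the dictionary having distinct keys and distinct values.

```python
def update_span(before, after, same):
    """Read off deletions and insertions from a vertical delta.

    before, after: sets of elements of B and B'.
    same: dict mapping each retained element of B to its counterpart in B'
          (the head of the span together with its two injective legs).
    """
    assert set(same) <= before and set(same.values()) <= after
    assert len(set(same.values())) == len(same)       # lower leg is injective
    deleted = before - set(same)                      # outside the upper leg
    inserted = after - set(same.values())             # outside the lower leg
    return deleted, inserted

B_old = {"p1", "p2", "p3"}
B_new = {"p1", "p2x", "p4"}
# p2 was renamed to p2x: the delta records it as the same element.
deleted, inserted = update_span(B_old, B_new, {"p1": "p1", "p2": "p2x"})
```

Note that the renamed element is neither deleted nor inserted. This information is carried by the delta itself and cannot be recovered from the two states alone.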

Suppose that we can re-align models A and B' and compute new horizontal delta r * b (think of
a span composition). If this new delta is not consistent, we need to update model A so that the updated
model A' would be in sync with B'. More accurately, we are looking for an update $a\colon A \leftrightarrow A'$
such that the triple (A', r', B') is consistent. Of course, we want to find a minimal update a (with
the biggest head) that does the job.

Unfortunately, in a majority of practically interesting situations, the minimality condition is
not strong enough to provide uniqueness of a. To achieve uniqueness, some update propagation policy
is to be chosen, and then we have an algebraic operation bPpg ('b' stands for 'backward'),
which, from a given pair of arrows (b, r) connected as shown in the figure, computes another pair (a, r')
connected with (b, r) as shown in the figure. Thus, a propagation policy is algebraically modeled by a
diagram operation of arity specified by the upper square in Fig. 7(a): shaded elements denote the input
data, whereas blank ones are the output. Analogously, choosing a forward update propagation policy
(from the A-side to the B-side) provides a forward operation fPpg as shown by the lower square.

The entire scenario is a composition of two operations: a part of the input for operation application
2:fPpg is provided by the output of 1:bPpg. In general, composition of diagram operations, i.e.,
operations acting upon configurations of arrows (diagrams), amounts to their tiling, as shown in the
figure; then complex synchronization scenarios become tiled structures. Details, precise definitions and
examples can be found in lff5l . 

Different diagram operations involved in model synchronization are not independent and their in- 
teraction must satisfy certain conditions. These conditions capture the semantics of synchronization 
procedures, and their understanding is important for the user of synchronization tools: it helps to avoid 
surprises when automatic synchronization steps in. Fortunately, principal conditions (synchronization 
laws) can be formulated as universally valid equations between diagrammatic terms — a tile algebra 
counterpart of universal algebraic identities. In this way BX becomes based on an algebraic theory: 
a signature of diagram operations and a number of equational laws they must satisfy. The appendix 
presents one such theory — the notion of a symmetric delta lens, which is currently an area of active 
research from both a practical and a theoretical perspective. 



4.2.2 BX: delta-based vs. state-based 

As mentioned above, understanding the semantics of model synchronization procedures is important, 
both theoretically and practically. Synchronization tools are normally built on some underlying algebraic 
theory ll28l l53l l40l |4] [TJ |4TJ [30), and many such tools (the first five amongst those cited above) use 
algebraic theories based on state-based rather than delta-based operations. The state-based version of the
propagation scenario in Fig. 7(a) is described in Fig. 7(b). The backward propagation operation takes
models A, B, B', computes necessary relations between them (r and b on the adjacent diagram), and then
computes an updated model A'. The two-chevron symbol reminds us that the operation actually consists
of two stages: model alignment (computing r and b) and update propagation as such.

The state-based frameworks, although they may look simpler, actually hide several serious deficiencies.
Model alignment is a difficult task that requires contextual information about models. It can be
facilitated by intelligent AI-based tools, or even be automated, but the user should have an option to step
in and administer corrections. In this sense, model alignment is similar to model matching preceding
model merge. Weaving alignment (delta discovery) into update (delta) propagation essentially complicates
the semantics of the latter, and correspondingly complicates the algebraic theory. In addition, the
user does not have access to alignment results and cannot correct them.
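The difference between the two styles can be seen in a small sketch. Below, `bppg` is one hypothetical backward propagation policy for models reduced to sets of elements: it takes the horizontal delta r and the head of the vertical delta b as explicit inputs, as in the delta-based setting. The state-based variant is emulated by first guessing the vertical delta with a naive name-based alignment (`align_by_name`, also hypothetical) and then calling the same propagation. All names and the policy itself are assumptions for illustration, not the operations of any cited framework.

```python
def bppg(A, r, b_head, B_new):
    """One possible backward propagation policy (illustrative only).

    A:      dict a_id -> private data kept only on the A side.
    r:      dict a_id -> b_id, the horizontal delta between A and B.
    b_head: dict b_id -> b'_id, retained elements of the vertical delta b.
    B_new:  set of element ids of the updated model B'.
    Returns (A', r') with r' relating A' to B'.
    """
    A_new, r_new = {}, {}
    for a, data in A.items():
        if a not in r:                  # element private to A: untouched
            A_new[a] = data
        elif r[a] in b_head:            # partner survived in B'
            A_new[a] = data
            r_new[a] = b_head[r[a]]
        # otherwise the partner was deleted, so a is deleted as well
    for b_new in sorted(B_new - set(b_head.values())):
        a = "new:" + b_new              # fresh A-element for an inserted one
        A_new[a] = None
        r_new[a] = b_new
    return A_new, r_new

def align_by_name(B_old, B_new):
    """Naive state-based alignment: equal names mean the same element."""
    return {x: x for x in B_old & B_new}

A = {"a1": "notes on Ann"}
r = {"a1": "Ann"}
B_old, B_new = {"Ann"}, {"Anna"}        # the user renamed Ann to Anna

delta_based = bppg(A, r, {"Ann": "Anna"}, B_new)
state_based = bppg(A, r, align_by_name(B_old, B_new), B_new)
```

With the delta supplied, the A-side element and its private notes survive the rename. With alignment hidden inside the operation, the rename is misread as a deletion followed by an insertion, the notes are lost, and the user has no place to correct the guess.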







Two other serious problems of the state-based frameworks and 
architectures are related to operation composition. The scenario de- 
scribed in Fig. |7Ja) assumes that the model correspondence (delta) 
used for update propagation 2:fPpg is the delta computed by opera- 
tion l:bPpg; this is explicitly specified in the tile algebra specifica- 
tion of the scenario. In contrast, the state-based framework cannot 
capture this requirement. A similar problem appears when we se- 
quentially compose a BX program synchronizing models A and B 
and another program synchronizing models B and C: composition 

amounts to horizontal composition of propagation operations as shown in Fig. [8] and again continuity, 
b\ = b2, cannot be specified in the state-based framework. A detailed discussion of delta- vs. state-based 
synchronization can be found in ll22l[T0l . 



Figure 8: State-based BX: erroneous horizontal composition



4.2.3 Assembling model transformations

Suppose M, N are two metamodels, and we need to transform M-instances (models) into N-ones. Such a
transformation makes sense if the metamodels are somehow related, and we suppose that their relationship
is specified by a span $(m\colon M \Leftarrow MN,\ n\colon MN \Rightarrow N)$ (Fig. 9), whose legs are
Kleisli mappings of the respective query monad.

Now N-translation of an M-model A can be done in two steps. First, view m is executed (via its Cartesian
lifting, actually going down in the figure), and we obtain a Kleisli arrow $m_A\colon A \Leftarrow R$
(with $R = A|_m$). Next we need to find an N-model B such that its view along n, $B|_n$, is equal to R.
In other words, given a view, we are looking for a source providing this view. There are many such
sources, and to achieve uniqueness, we need to choose some policy. Afterwards, we compute model B
related to A by the span $(m_A, n_B)$.

If model A is updated to A', it is reasonable to compute a corresponding update $b\colon B \leftrightarrow B'$
rather than recompute B' from scratch (recall that models can contain thousands of elements).
Computing b again consists of two steps shown in the figure.

Operations $Get^m$ and $Put^n$ are similar to fPpg and bPpg considered above, but work in the asymmetric
situation when mappings m and n are total (Kleisli) functions and hence view R contains nothing new
wrt. M and N. Because of asymmetry, operations Get ('get' the view update) and Put ('put' it back to
the source) are different. $Get^m$ is uniquely determined by the view definition m. $Put^n$ needs, in
addition to n, some update propagation policy. After the latter is chosen, we can realize transformation
from M to N incrementally by composition $fPpg = Get^m; Put^n$; this is an imprecise linear notation for
tiling (composition of diagram operations) specified in Fig. 9.

Figure 9: Model transformation via GetPut-decomposition

Footnote 2. A difference is that model matching usually refers to relating independently developed models,
while models to be aligned are often connected by a given transformation.

Note that the initial transformation from M to N, which first sends an M-instance A to its view
$R = A|_m$ and then finds an N-instance $B \in N$ such that $B|_n = R$, can also be captured by Get and Put.
For this, we need to postulate initial objects $\Omega_M$ and $\Omega_N$ in the categories of M- and
N-instances, so that for any A over M and B over N there are unique updates $0_A\colon \Omega_M \to A$ and
$0_B\colon \Omega_N \to B$. Moreover, there is a unique span
$(m_\Omega\colon \Omega_M \Leftarrow \Omega_{MN},\ n_\Omega\colon \Omega_{MN} \Rightarrow \Omega_N)$
relating these initial objects. Now, given a model A, model B can be computed as B' in Fig. 9 with the
upper span being $(m_\Omega, n_\Omega)$, and the update a being $0_A\colon \Omega_M \to A$.

The backward transformation is defined similarly by swapping the roles of m and n: 

$bPpg = Get^n; Put^m$.

The schema described above can be seen as a general pattern for defining model transformation
declaratively, with all the benefits (and all the pains) of having a precise specification, which the
implementation must obey, before the implementation is approached. Moreover, this schema can provide
some semantic guarantees in the following way. Within the tile algebra framework, laws for operations
Get and Put, and their interaction (invertibility), can be precisely specified [22] (see also the
discussion in Section 5.1); algebras of this theory are called delta lenses. Then we can deduce the laws
for the composed operations fPpg and bPpg from the delta lens laws. Also, operations $Get^m$, $Put^m$ can
themselves be composed from smaller blocks, if the view m is composed, $m = m_1; m_2; \ldots; m_k$, via
sequential lens composition. In this way, a complex model transformation is assembled from elementary
transformation blocks, and its important semantic properties are guaranteed. More examples and details
can be found in [15].
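The idea of assembling Get and Put from smaller blocks can be sketched in code. The sketch below is a deliberately simplified, state-based approximation (it propagates states rather than deltas), and the names `Lens`, `then` and `field` are hypothetical and chosen only for this illustration; it is meant to show the shape of sequential lens composition, not the delta lens theory of [22].

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Lens:
    get: Callable[[Any], Any]       # source -> view
    put: Callable[[Any, Any], Any]  # (old source, new view) -> new source

    def then(self, inner: "Lens") -> "Lens":
        """Sequential composition: the view of self is the source of inner."""
        return Lens(
            get=lambda s: inner.get(self.get(s)),
            # Push the new view through inner first, then put the result back via self.
            put=lambda s, v: self.put(s, inner.put(self.get(s), v)),
        )


def field(key: str) -> Lens:
    """Elementary block: the view is one field of a dictionary-shaped model."""
    return Lens(get=lambda d: d[key], put=lambda d, v: {**d, key: v})


# A composed view m = m1; m2 built from two elementary blocks.
city = field("address").then(field("city"))

model = {"name": "Ann", "address": {"city": "Oslo", "zip": "0150"}}
updated = city.put(model, "Bergen")
print(city.get(model))
print(updated)
```

Because each elementary block satisfies the usual round-tripping laws, the composed lens does too; this is the sense in which semantic properties of a complex transformation are inherited from its blocks.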



5 Applying CT to MDE: Examples and Discussion. 

We will try to exemplify and discuss three ways in which CT can be applied in MDE. The first one — 
gaining a deeper understanding of an engineering problem — is standard, and appears as a particular 
instantiation of the general case of CT's employment in applied domains. The other two are specific to 
SE: structural patterns provided by categorical models of the software system to be built can directly 
influence the design. We will use models of BX as our main benchmark; other examples will be also 
used when appropriate. 



5.1 Deeper understanding.

As mentioned in Sect. 4.2, stating algebraic laws that BX procedures must obey is practically important
as it provides semantic guarantees for synchronization procedures. Moreover, the formulation of these
laws should be semantically transparent and concise, as the user of synchronization tools needs a clear
understanding of propagation semantics. The original state-based theory of asymmetric BX [28] considered
two groups of laws: invertibility (or round-tripping) laws, GetPut and PutGet, and history ignorance,
PutPut. The two former laws say that the two propagation operations, Get and Put, are mutually inverse.
The PutPut law says that if a complex update is decomposed into consecutive pieces, it can be propagated
incrementally, one piece after the other. A two-sorted algebra comprising two operations, Get and Put,
satisfying the laws, is called a well-behaved lens.
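For concreteness, the three state-based laws can be written as executable checks. This is a minimal sketch on an assumed toy lens (the source is a record with a name and a phone, the view is the name); the function names `get_put`, `put_get` and `put_put` are ours and simply mirror the law names.

```python
from typing import Dict

Source = Dict[str, str]  # e.g. {"name": ..., "phone": ...}


def get(s: Source) -> str:
    """The view exposes only the name."""
    return s["name"]


def put(s: Source, v: str) -> Source:
    """Write the new name back, keeping everything the view does not show."""
    return {**s, "name": v}


def get_put(s: Source) -> bool:
    """GetPut: putting back an unchanged view changes nothing."""
    return put(s, get(s)) == s


def put_get(s: Source, v: str) -> bool:
    """PutGet: the view of the updated source is the view that was put."""
    return get(put(s, v)) == v


def put_put(s: Source, v1: str, v2: str) -> bool:
    """PutPut: propagating two consecutive updates equals propagating the last one."""
    return put(put(s, v1), v2) == put(s, v2)


s = {"name": "Ann", "phone": "555-1"}
laws_hold = get_put(s) and put_get(s, "Anna") and put_put(s, "Anna", "Anne")
print(laws_hold)
```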

Even an immediate arrow-based generalization of lenses to delta lenses (treated in elementary terms
via tile algebra [13, 22]) revealed that the GetPut law is a simple law of identity propagation, IdPut,
rather than of round-tripping. The benefits of renaming GetPut as IdPut are not exhausted by
clarification of semantics: as soon as we understand that the original GetPut is about identity
propagation, we at once ask what the real round-tripping law GetPut should be, and at once see that
operation Put is not the inverse of Get. We only have the weaker 1.5-round-tripping GetPutGet law (or
weak invertibility; see the Appendix, where the laws in question are named IdPpg and fbfPpg and bfbPpg).
It is interesting
(and remarkable) that papers lPT4ll3Tll . in which symmetric lenses are studied in the state-based setting, 
mistakenly consider identity propagation laws as round-tripping laws, and correspondingly analyze a 
rather poor BX-structure without real round-tripping laws at all. 

The tile algebra formulation of the PutPut law clarified its meaning as a composition preservation
law [15, 22], but did not solve the enigmatic PutPut problem. The point is that PutPut does not hold
in numerous practically interesting situations, but its entire removal from the list of BX laws is also
not satisfactory, as it leaves propagation procedures without any constraints on their compositionality.
The problem was solved, or at least essentially advanced, by a truly categorical analysis performed by
Michael Johnson et al. [35, 34]. They have shown that an asymmetric well-behaved lens is an algebra
for some KZ monad, and PutPut is nothing but the basic associativity condition for this algebra. Hence,
as Johnson and Rosebrugh write in [34], the status of PutPut changes from being (a) "some law
that may have arisen from some special applications and should be discarded immediately if it seems
not to apply in a new application" to (b) a basic requirement of an otherwise adequate and general
mathematical model. And indeed, Johnson and Rosebrugh have found a weaker, monotonic version
of PutPut (see Fig. 13 in the Appendix), which holds in a majority of practical applications, including
those where the original (non-monotonic or mixed) PutPut fails. Hopefully, this categorical analysis can
be generalized for the symmetric lens case, thus stating solid mathematical foundations for BX.
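A small experiment shows why PutPut fails for mixed updates yet survives for monotonic ones. The sketch is a state-based toy under our own assumptions (the source maps names to phones, the view shows names only, and elements are identified by name); the law in Fig. 13 is formulated for deltas, so the code only approximates it.

```python
from typing import Dict, List

Source = Dict[str, str]  # name -> phone (the phone is invisible in the view)
View = List[str]         # names only


def get(s: Source) -> View:
    return list(s)


def put(s: Source, v: View) -> Source:
    """Keep the phone of every surviving name; new names get an empty phone."""
    return {name: s.get(name, "") for name in v}


s = {"Ann": "555-1", "Bob": "555-2", "Cy": "555-3"}

# Mixed update: delete Bob, then insert Bob again. The phone is lost on the way.
mixed_stepwise = put(put(s, ["Ann", "Cy"]), ["Ann", "Bob", "Cy"])
mixed_direct = put(s, ["Ann", "Bob", "Cy"])

# Inserts only: stepwise and one-shot propagation agree.
inserts_stepwise = put(put(s, ["Ann", "Bob", "Cy", "Dan"]), ["Ann", "Bob", "Cy", "Dan", "Eve"])
inserts_direct = put(s, ["Ann", "Bob", "Cy", "Dan", "Eve"])

# Deletes only: stepwise and one-shot propagation agree as well.
deletes_stepwise = put(put(s, ["Ann", "Bob"]), ["Ann"])
deletes_direct = put(s, ["Ann"])

print(mixed_stepwise == mixed_direct)      # unrestricted PutPut
print(inserts_stepwise == inserts_direct)  # monotonic PutPut, inserts
print(deletes_stepwise == deletes_direct)  # monotonic PutPut, deletes
```

The deleted and re-inserted element annihilates private information that cannot be restored, which is exactly the situation the monotonic version of the law excludes.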

5.2 Design patterns for specific problems. 

Recalling Figure 3, Figure 10 presents a rough illustration of how mathematical models can reshape our
view of a domain or construct X. Building a well-structured mathematical model M of X, and then
reinterpreting it back to X, can change our view of the latter as schematically shown in the figure with
the reshaped construct X'. Note the discrepancy between the reshaped X' and model M: the upper-left
block is missing from X'. If X is a piece of reality (think of mathematical modeling of physical
phenomena), this discrepancy means, most probably, that the model is not adequate (or, perhaps, some
piece of X is not observable). If X is a piece of software, the discrepancy may point to a deficiency
of the design, which can be fixed by redesigning the software. It is even better to base software design
on a well-structured model from the very beginning. Then we say that model M provides a design pattern
for X.

We have found several such cases in our work with categorical modeling of MDE-constructs. For 
example, the notion of a jointly-monic n-ary arrow span turns out to be crucial for modeling associations 
between object classes, and their correct implementation as well ifPTll . It is interesting to observe how 
a simple arrow arrangement allows one to clean the UML metamodel and essentially simplify notation 
lfl2l [H. Another example is modeling intermodel mappings by Kleisli morphisms, which provide a 
universal pattern for model matching (a.k.a. alignment) and greatly simplify model merge as discussed
in Sect. 4.1. In addition, the Kleisli view of model mappings provides a design pattern for mapping
composition, a problem considered to be difficult in the model management literature [3]. Sequential
composition of symmetric delta lenses is also not evident; considering such lenses as algebras whose
carriers are profunctors (see Appendix) suggests a precise pattern to be checked (this work is now in
progress). Decomposition of a model transformation into Cartesian lifting (view execution) followed
by the inverse operation of Cartesian lifting completion (view updating), as described in Section 4.2.3,
provides useful guidance for model transformation design, known to be laborious and error-prone. In
particular, it immediately provides bidirectionality.

Figure 10: From mathematical models to design patterns

The graph transformation community also developed several general patterns applicable to MDE
(with models considered as typed attributed graphs, see [23] for details). In particular, an industrial
standard for model transformation, QVT [41], was essentially influenced by triple-graph grammars (TGGs).
Some applications of TGGs to model synchronization (and further references) can be found in [30].



5.3 Diagrammatic modeling culture and tool architecture. 

The design patterns mentioned above are based on the respective categorical machinery (monads,
fibrations, profunctors). A software engineer not familiar with these patterns would hardly recognize
them in the arrays of implementation details. Even less probable is that they will abstract away their
implementation concerns and reinvent such patterns from scratch; distillation of these structures by the
CT community took a good amount of time. In contrast, simple arrow diagrams, like in Fig. 7(a) (see also
the Appendix), do not actually need any knowledge of CT: all that is required is making intermodel
relations explicit, and denoting them by arcs (directed or undirected) connecting the respective objects.
To a lesser extent, this also holds for the model transformation decomposition in Fig. 9 and the model
merge pattern in Fig. 6. We say "to a lesser extent" because the former pattern still needs familiarity
with the relations-are-spans idea, and the latter needs an understanding of what a colimit is (but,
seemingly, it should be enough to understand it roughly as some algebraic procedure of "merging things").

The importance of mappings between models/software artifacts is now well recognized in many
communities within SE, and graphical notations have been employed in SE for a long time. Nevertheless,
a majority of model management tools neglect the primary status of model mappings: in their
architecture, model matching and alignment are hidden inside (implementations of) algebraic routines,
thus complicating both the semantics and the implementation of the latter; concerns are intricately mixed
rather than separated. As all SE textbooks and authorities claim separation of concerns to be a
fundamental principle of software design, an evident violation of the principle in the cases mentioned
above is an empirical fact that puzzles us. It is not clear why a BX-tool designer working on tool
architecture does not consider simple arrow diagrams like in Fig. 7(a), and prefers discrete diagrams
like in Fig. 7(b). The latter are, of course, simpler, but their simplicity is deceiving in an almost
evident way.

The only explanation we have found is that understanding the deceiving simplicity of discrete
diagrams (b), and, simultaneously, the manageability of arrow diagrams (a), needs a special diagrammatic
modeling culture that a software engineer normally does not possess. This is the culture of elementary
arrow thinking, which covers the most basic aspects of manipulating and using arrow diagrams.
It appears that even elementary arrow thinking habits are not cultivated in the current SE curriculum,
the corresponding high-level specification patterns are missing from the software designer toolkit, and
software is often structured and modularized according to the implementation rather than specification
concerns.

6 Related work

First applications of CT in computer science, and the general claim of CT's extreme usefulness for
computer applications should be, of course, attributed to Joseph Goguen [29]. The shift from modeling
semantics of computation (behavior) to modeling structures of software programs is emphasized by Jose
Fiadeiro in the introduction to his book [36], where he refers to a common "social" nature of both
domains. The ideas put forward by Fiadeiro were directly derived from joint work with Tom Maibaum
on what has become known as component based design and software architecture [26, 27, 24]. A clear
visualization of these ideas by Fig. 2 (with M standing for Fiadeiro's "social") seems to be new. The idea
of round-tripping modeling chain Fig. [3] appears to be novel, its origin can be traced to iPTTl . 

Don Batory makes an explicit call to using CT in MDE in his invited lecture for MoDELS'2008 1H, 
but he employs the very basic categorical means, in fact, arrow composition only. In our paper we refer 
to much more advanced categorical means: sketches, fibrations, Cartesian monads, Kleisli categories. 

Generalized sketches (graphs with diagram predicates) as a universal syntactical machinery for
formalizing different kinds of models were proposed by Diskin et al. [18]. Their application to special
MDE problems can be found in ifTTl [38ll and in the work of Rutle et al, see fl6l . B3l and references 
therein. A specific kind of sketches, ER-sketches, is employed for a number of problems in the database
context by Johnson et al. [32]. Considering models as typed attributed graphs with applications to MDE
has been extensively put forward by the graph transformation (GT) community [23]; their work is much
more operationally oriented than our concerns in the present paper. On the other hand, in contrast to the
generalized sketches framework, constraints do not seem to be first-class citizens in the GT world.

The shift from functorial to fibrational semantics for sketches to capture the metamodeling
foundations of MDE was proposed in [13] and formalized in [20]. This semantics is heavily used in [15],
and in the work of Rutle et al. mentioned above. Comparison of the two semantic approaches, functorial
and fibrational, and the challenges of proving their equivalence, are discussed in [52].

The idea of modeling query languages by monads, and metamodel (or data schema) mappings by
Kleisli mappings, within the functorial semantics approach, was proposed in [16], and independently
by Johnson and Rosebrugh in their work on ER-sketches [32]. Reformulation of the idea for fibrational
semantics was developed and used for specifying important MDE constructs in |[T5l I2D . An accurate 
formalization via Cartesian monads can be found in [19].

Algebraic foundations for BX are now an area of active research. Basics of the state-based algebraic
framework (lenses) were developed by Pierce with coauthors [28]; their application to MDE is due to
Stevens [50]. Delta lenses [22, 10] are a step towards categorical foundations, but they have been
described in elementary terms using tile algebra [15]. A categorical approach to the view update problem
has been developed by Johnson and Rosebrugh et al. [33], and extended to categorical foundations for
lenses based on KZ-monads in [35, 34]. The notion of symmetric delta lens in the Appendix is new; it
results from incorporating the monotonic PutPut-law idea of Johnson and Rosebrugh into the earlier notion
of symmetric delta lens [10]. Assembling synchronization procedures from elementary blocks is discussed
inlBl. 

7 Conclusion 

The paper claims that category theory is a good choice for building mathematical foundations for MDE.
We first discuss two very general prerequisites for concepts and structures developed in category theory
to be well applicable to mathematical modeling of MDE-constructs. We then exemplify the arguments
by sketching several categorical models, which range from general definitions of multimodeling
and intermodeling to important model management scenarios of model merge and bidirectional update
propagation. We briefly explain (and refer to other work for relevant details) that these categorical
models provide useful design patterns and guidance for several problems considered to be difficult.

Moreover, even an elementary arrow arrangement of model merge and BX scenarios makes explicit 
a deficiency of the modern tools automating these scenarios. To wit: these tools' architecture weaves 
rather than separates such different concerns as (i) model matching and alignment based on heuristics and 
contextual information, and (ii) relatively simple algebraic routines of merging and update propagation. 
This weaving complicates both semantics and implementation of the algebraic procedures, does not allow 
the user to correct alignment if necessary, and makes tools much less flexible. It appears that even simple 
arrow patterns, and the corresponding structural decisions, may not be evident for a modern software 
engineer. 

Introduction of CT courses into the SE curriculum, especially in the MDE context, would be the
most natural approach to the problem: even elementary CT studies should cultivate arrow thinking,
develop habits of diagrammatic reasoning and build a specific intuition of what is a healthy vs.
ill-formed structure. We believe that such intuition, and the structural lessons one can learn from CT,
are of direct relevance for many practical problems in MDE.

A Appendix. Algebra of bidirectional update propagation

In Section 4.2 we considered operations of update propagation, but did not specify any laws they must
satisfy. Such laws are crucial for capturing semantics, and the present section aims to specify algebraic
laws for BX. We will do it in an elementary way using tile algebra (rather than categorically, which is
non-trivial and left for future work). We will begin with the notion of an alignment framework to
formalize delta composition (* in Section 4.2), and then proceed to algebraic structures modeling BX:
symmetric delta lenses. (Note that the lenses we will introduce here are different from those defined
in [10].)

Figure 11: Realignment operations and their laws



Definition 1. An alignment framework is given by the following data.

(i) Two categories with pullbacks, A and B, called model spaces. We will consider spans in these 
categories up to their equivalence via a head isomorphism commuting with legs. That is, we will work 
with equivalence classes of spans, and the term 'span' will refer to an equivalence class of spans. Then 
span composition (via pullbacks) is strictly associative, and we have categories (rather than bicategories) 
of spans, Spani(A) and Spani(B). Their subcategories consisting of spans with injective legs will be 
denoted by A* and B* resp. 

Such spans are to be thought of as (model) updates. They will be depicted by vertical bi-directional
arrows, for example, a and b in the diagrams in Fig. 11(a). We will assume that the upper node of such
an arrow is its formal source, and the lower one is the target; the source is the original (state of the)
model, and the target is the updated model. Thus, model evolution is directed down.

A span whose upper leg is identity (nothing deleted) is an insert update; it will be denoted by unidi- 
rectional arrows going down. Dually, a span with identity lower leg is a delete update; it will be denoted 
by a unidirectional arrow going up (but the formal source of such an arrow is still the upper node). 

(ii) For any two objects, $A \in \mathbf{A}_0$ and $B \in \mathbf{B}_0$, there is a set R(A,B) of
correspondences (or corrs in short) from A to B. Elements of R(A,B) will be depicted by bi-directional
horizontal arrows, whose formal source is A and the target is B.

Updates and corrs will also be called vertical and horizontal deltas, resp. 

(iii) Two diagram operations over corrs and updates called forward and backward (re)alignment.
Their arities are shown in Fig. 11(a) (output arrows are dashed). We will also write a * r for fAln(a, r)
and r * b for bAln(b, r). We will often skip the prefix 're' and say 'alignment' to ease terminology.

There are three laws regulating alignment. Identity updates do not actually need realignment:

(IdAln)    $id_A * r = r = r * id_B$

for any corr $r\colon A \leftrightarrow B$.

The result of applying a sequence of interleaving forward and backward alignments does not depend
on the order of application, as shown in Fig. 11(b):

(fAln-bAln)    $(a * r) * b = a * (r * b)$

for any corr r and any updates a, b.

We will call diagrams like those shown in Fig. 11(a,b) commutative if the arrow at the respective
operation output is indeed equal to the one computed by the operation. For example, diagram (b) is
commutative if $r' = a * r * b$.

Figure 12: Operations of update propagation



Finally, alignment is compositional: for any consecutive updates $a\colon A \to A'$, $a'\colon A' \to A''$,
$b\colon B \to B'$, $b'\colon B' \to B''$, the following holds:

(AlnAln)    $a' * (a * r) = (a; a') * r$  and  $(r * b) * b' = r * (b; b')$

where ; denotes sequential span composition.
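The three alignment laws can be checked on one simple concrete model. The sketch below is an illustration under our own assumptions rather than the general definition: models are sets of element identifiers, an update (a span with injective legs) is represented by the partial injective map of its kept elements, a corr is a set of pairs, and the names `f_aln`, `b_aln` and `compose` are hypothetical.

```python
from typing import Dict, FrozenSet, Tuple

Update = Dict[str, str]            # kept elements: old id -> new id (injective)
Corr = FrozenSet[Tuple[str, str]]  # pairs (element of A, element of B)


def f_aln(a: Update, r: Corr) -> Corr:
    """Forward realignment a * r: re-anchor the A-side of r along update a."""
    return frozenset((a[x], y) for x, y in r if x in a)


def b_aln(r: Corr, b: Update) -> Corr:
    """Backward realignment r * b: re-anchor the B-side of r along update b."""
    return frozenset((x, b[y]) for x, y in r if y in b)


def compose(u1: Update, u2: Update) -> Update:
    """Sequential composition u1; u2 of updates."""
    return {x: u2[y] for x, y in u1.items() if y in u2}


r = frozenset({("a1", "b1"), ("a2", "b2"), ("a3", "b3")})
a = {"a1": "a1", "a2": "a2x"}        # A -> A': a3 deleted, a2 re-identified
a_next = {"a1": "a1", "a2x": "a2y"}  # A' -> A''
b = {"b1": "b1", "b2": "b2"}         # B -> B': b3 deleted
b_next = {"b1": "b1"}                # B' -> B'': b2 deleted
id_a = {x: x for x in ("a1", "a2", "a3")}
id_b = {y: y for y in ("b1", "b2", "b3")}

id_law = f_aln(id_a, r) == r == b_aln(r, id_b)                                # (IdAln)
interchange_law = b_aln(f_aln(a, r), b) == f_aln(a, b_aln(r, b))              # (fAln-bAln)
f_composition_law = f_aln(a_next, f_aln(a, r)) == f_aln(compose(a, a_next), r)  # (AlnAln), A side
b_composition_law = b_aln(b_aln(r, b), b_next) == b_aln(r, compose(b, b_next))  # (AlnAln), B side
print(id_law, interchange_law, f_composition_law, b_composition_law)
```

In this model realignment simply transports links along kept elements and drops links to deleted ones, which is why all three laws reduce to properties of composing partial maps.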

It is easy to see that having an alignment framework amounts to having a functor
$\alpha\colon A^{*} \times B^{*} \to Set$.

Definition 2. A symmetric delta lens (briefly, an sd-lens) is a triple
($\alpha$, fPpg, bPpg) with $\alpha\colon A^{*} \times B^{*} \to Set$ an alignment framework, and fPpg, bPpg
two diagram operations over corrs and updates (called forward and backward update propagation, resp.).
The arities are specified in Fig. 12(a) with output arrows dashed and output nodes not framed. Sometimes
we will use a linear notation and write b = a.fPpg(r) and a = b.bPpg(r) for the cases specified in the
diagrams.

Each operation must satisfy the following laws. 

Stability or IdPpg law: if nothing changes on one side, nothing happens on the other side as well,
that is, identity mappings are propagated into identity mappings as shown by the diagrams in Fig. 12(b).

Monotonicity: Insert updates are propagated into inserts, and delete updates are propagated into
deletes, as specified in Fig. 12(c).

Monotonic Compositionality or PpgPpg law: composition of two consecutive inserts is propagated
into composition of propagations as shown by the left diagram in Fig. 13 (to be read as follows: if the
two squares are fPpg, then the outer rectangle is fPpg as well). The right diagram specifies
compositionality for deletes. The same laws are formulated for bPpg.

Note that we do not require compositionality for propagation of general span updates. The point is
that interleaving inserts and deletes can annihilate, and lost information cannot be restored: see
[28, 10] for examples.



Commutativity: Diagrams in Fig. 12(a) must be commutative in the sense that $a * r * b = r'$.

Finally, forward and backward propagation must be coordinated with each other by some invertibility
law. Given a corr $r\colon A \to B$, an update $a\colon A \to A'$ is propagated into update
$b = a.fPpg(r)$, which can be propagated back to update $a' = b.bPpg(r)$. For an ideal situation of strong
invertibility, we should require $a' = a$. Unfortunately, this does not hold in general because the
A*-specific part of the information is lost in passing from a to b, and cannot be restored [10]. However,
it makes sense to require the following weak invertibility specified in Fig. 14, which does hold in a
majority of practically interesting situations, e.g., for BX determined by TGG-rules [30]. The law fbfPpg
says that although $a_1 = a.fPpg(r).bPpg(r) \neq a$, $a_1$ is equivalent to a in the sense that
$a_1.fPpg(r) = a.fPpg(r)$. Similarly for the bfbPpg law.

Figure 14: Round-tripping laws. (Scenario in diagram (b) "runs" from the right to the left.)

The notion of sd-lens is specified above in elementary terms using tile algebra. Its categorical
underpinning is not evident, and we only present several brief remarks.

1) An alignment framework $\alpha\colon A^{*} \times B^{*} \to Set$ can be seen as a profunctor, if
A-arrows are considered directed up (i.e., the formal source of update a in diagram Fig. 11(a) is A',
and the target is A). Then alignment amounts to a functor
$\alpha\colon (A^{*})^{op} \times B^{*} \to Set$, that is, a profunctor $\alpha$ from B* to A*. Note that
reversing arrows in A* actually changes the arity of operation fAln: now its input is a pair (a, r) with
a an update and r a corr from the target of a, and the output is a corr r' from the source of a, that is,
realignment goes back in time.

Figure 13: Monotonic (PpgPpg) laws

2) Recall that operations fPpg and bPpg are functorial wrt. injective arrows in A, B, not wrt. arrows
in A*, B*. However, if we try to resort to A, B entirely and define alignment wrt. arrows in A, B, then we
will need two fAln operations with different arities for inserts and deletes, and two bAln operations with
different arities for inserts and deletes. We will then have four functors
$\alpha_i\colon A \times B \to Set$ with i ranging over the four-element set {insert, delete} × {A, B}.

3) The weak invertibility laws suggest that a Galois connection/adjunction is somehow hidden in 
sd-lenses. 

4) Working with chosen spans and pullbacks rather than with their equivalence classes provides a 
more constructive setting (given we assume the axiom of choice), but then associativity of span compo- 
sition only holds up to chosen natural isomorphisms, and A* and B* have to be considered bicategories 
rather than categories. 

All in all, we hope that the categorical analysis of asymmetric delta lenses developed by Johnson et
al. [35, 34] could be extended to capture the symmetric case too.



